Search Results: "geissert"

6 March 2013

Raphael Geissert: A bashism a week: returning

Inspired by Thorsten Glaser's comment about where you can break from, this "bashism a week" is about a behaviour not implemented by bash.

return is a special built-in utility, and it should only be used in functions and in scripts executed by the dot utility. That's what the POSIX:2001 specification requires.

If you return from any other scope, for example by accidentally calling it from a script that was not sourced but executed directly, the bash shell won't forgive you: it does not abort the execution of commands. This can lead to undesired behaviour.

A wide variety of shell interpreters silently handle such calls to return as if exit had been called.

An easy way to avoid such undesired behaviours is to follow the best practice of setting the e option, i.e. set -e. With that option set at the moment of calling return outside of the allowed scopes, bash will abort the execution, as desired.

However, the POSIX specification does not guarantee the above behaviour either, as the result in such cases is "unspecified".
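A minimal sketch of one way to stay within what POSIX specifies: keep every return inside a function, where it is always allowed, and let the top level of the script finish with an explicit exit. The function name main and its argument are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical example: 'return' only appears inside a function,
# so its behaviour does not depend on how the script was started.
main() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    echo "directory found: $1"
    return 0
}

main /
status=$?
echo "main returned $status"
# In a real script, finish with: exit "$status"
```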

27 February 2013

Raphael Geissert: A bashism a week: appending

The very well known appending operator += is a bashism commonly found in the wild. Even though it can be used for things such as adding to integers (when the variable is declared as such) or appending to arrays, it is usually used for appending to a string variable.

As I have previously blogged, the appending operator bashism is only useful when programming for the bash shell.

Whenever you want to append a string to a variable, repeating the name of the variable is the portable way. I.e.
foo=foo
foo="${foo} bar"
# Instead of foo+=" bar", which is a bashism

See? Replacing the += operator is not rocket science.

Note: One should be aware that makefiles do have a += operator which is safe to use when appending to a make variable. But don't let this "exception" fool you: code in configure.ac and similar files is executed by the shell interpreter. So don't use the appending operator there.
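A slightly longer sketch of the same idea, building a space-separated list in a loop without +=. The variable names are made up for illustration; ${list:+ } expands to a space only when list is already non-empty, which avoids a leading separator.

```shell
#!/bin/sh
list=""
for item in alpha beta gamma; do
    # Repeat the variable name instead of writing list+=" $item".
    list="${list}${list:+ }${item}"
done
echo "$list"
```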

25 February 2013

Raphael Geissert: A tale of a bug report

Part 1:
A bug report is filed.
Part 2:
A patch is later provided by the submitter.
Part 3:
The patch is added to the package, the bug gets fixed.

[some time later]

Part 4:
A new upstream version is released, the patch is dropped.
Part 5:
The bug report is filed, again.

20 February 2013

Raphael Geissert: A bashism a week: pushing and pop'ing directories

Want to switch back-and-forth between directories in your shell script?
The bashism of this week can be of some help, but for most needs, the cd utility is more than enough.

pushd, popd, and the extra built-in dirs are bashisms that allow one to create and manipulate a stack of directory entries. For a simple, temporary, switch of directories the following code is portable as far as POSIX:2001 is concerned:

cd /some/directory
touch some files
unlink others
# etc
cd - >/dev/null
# We are now back at where we were before the first 'cd'

Which is equivalent to the following, also portable, code:

cd /some/directory
touch some files
unlink others
# etc
cd "$OLDPWD"
# We are now back at where we were before the first 'cd'

Multiple switches can also be implemented portably without storing the name of the directories in variables at the expense of using subshells (and their side-effects).
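The subshell approach could look roughly like the sketch below. The directories / and /dev are arbitrary choices for illustration, and the comment about lost variables is the main side-effect to keep in mind.

```shell
#!/bin/sh
start=$(pwd)
(
    # First switch: only affects this subshell.
    cd / || exit 1
    # Second, nested switch: command substitution is a subshell too.
    nested=$(cd /dev && pwd)
    echo "in subshell: $(pwd), nested: $nested"
)
# No 'cd -' needed: the parent shell never left its directory.
# Side-effect: variables set inside the subshell (such as nested) are lost.
echo "back in: $(pwd)"
```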

However, if you think you can solve your problem more conveniently by using "pushd" and "popd" don't forget to document the need of those built-ins and to adjust the shebang of your script to that of a shell that implements them, such as bash.

13 February 2013

Raphael Geissert: A bashism a week: negative matches

Probably due to the popular way of expressing the negation of a character class in regular expressions, it is common to see negative patterns such as [^b] in shell scripts.

However, using an expression such as [^b] where the shell is the one processing the pattern will cause trouble with shells that don't support that extension. The right way to express the negation is using an exclamation mark, as in: [!b]

Big fat note: this only applies to patterns that the shell is responsible for processing. Some of such cases are:

case foo in
[!m]oo)
echo bar
;;
esac
and
# everything but backups:
for file in documents/*[!~]; do
echo doing something with "$file" ...
done

If the pattern is processed by another program, beware that most won't interpret the exclamation the way the shell does. E.g.

$ printf "foo\nbar\nbaz\n" | grep '^[^b]'
foo
$ printf "foo\nbar\nbaz\n" | grep '^[!b]'
bar
baz
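As a further illustration of a shell-processed negative pattern, here is a small sketch that checks whether a string consists only of digits. The function name is_number is hypothetical.

```shell
#!/bin/sh
# is_number: hypothetical helper, succeeds only for non-empty strings of digits.
is_number() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;  # empty, or contains a non-digit
        *) return 0 ;;
    esac
}

for value in 2013 20a13; do
    if is_number "$value"; then
        echo "$value: number"
    else
        echo "$value: not a number"
    fi
done
```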

6 February 2013

Raphael Geissert: A bashism a week: short-circuiting tests

The test/[ command is home to several bashisms, but as I believe I have demonstrated: incompatible behaviour is to be expected.

The "-a" and "-o" binary logical operators are no exception, even if documented by the Debian Policy Manual.

One feature of writing something like the following code, is that upon success of the first command, the second won't be executed: it will be short-circuited.
[ -e /dev/urandom ] || [ -e /dev/random ]

Now, using the "-a" or "-o" bashisms even in shell interpreters that support them can result in unexpected behaviour: some interpreters will short-circuit the second test, others won't.

For example, bash doesn't short-circuit:
$ strace bash -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1 | grep /dev
stat64("/dev/urandom", ...) = 0
stat64("/dev/random", ...) = 0
Neither does dash:
$ strace dash -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1 | grep /dev
stat64("/dev/urandom", ...) = 0
stat64("/dev/random", ...) = 0
But posh does:
$ strace posh -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1 | grep /dev
stat64("/dev/urandom", ...) = 0
And so does pdksh:
$ strace pdksh -c '[ -e /dev/urandom -o -e /dev/random ]' 2>&1 | grep /dev
stat64("/dev/urandom", ...) = 0

(Output of strace redacted for brevity.)

So even in Debian, where the feature can be expected to be implemented, its semantics are not very well defined. So much for using this bashism... better avoid it.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.
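A minimal sketch of how a combined test can be written with separate [ commands joined by the shell's own && and || operators, which do short-circuit. The variables and their values are hypothetical; the braces group the inner alternatives the way parentheses would inside a single test.

```shell
#!/bin/sh
# Hypothetical values, only for illustration.
user="root"
uid=""

# Instead of: [ -n "$user" -a \( -n "$uid" -o "$user" = root \) ]
if [ -n "$user" ] && { [ -n "$uid" ] || [ "$user" = root ]; }; then
    result=match
else
    result=nomatch
fi
echo "$result"
```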

30 January 2013

Raphael Geissert: A bashism a week: sleep

To delay execution of some commands in a shell script, the sleep command comes in handy.
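Even though many shells do not provide it as a built-in and the GNU sleep command is used, there are a couple of things to note:

  1. Suffixes, such as m for minutes or h for hours, may not be supported.
  2. Fractions of seconds, such as 1.5, may not be supported.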

This of course is regarding what is required by POSIX:2001; it only requires the sleep command to take an unsigned integer. FreeBSD's sleep command does accept fractions of seconds, for example.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.

In this case, since the sleep command is not required to be a built-in, it does not matter what shell you specify in your script's shebang. Moreover, calling /bin/sleep doesn't guarantee you anything. The exception is if you specify a shell that has its own sleep built-in, then you could probably rely on it.

The easiest replacement for suffixes is calculating the desired amount of time in seconds. As for the second case, you may want to reconsider your use of a shell script.
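Calculating the time in seconds can be left to shell arithmetic, as in this sketch. The durations are made-up values, and the sleep call itself is commented out so that the example finishes immediately.

```shell
#!/bin/sh
# Instead of relying on suffixes such as "sleep 1h" followed by "sleep 2m":
hours=1
minutes=2
delay=$(($hours * 3600 + $minutes * 60))
echo "sleeping for $delay seconds"
# sleep "$delay"   # left commented out so the sketch finishes immediately
```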

23 January 2013

Raphael Geissert: A bashism a week: output redirection

Redirecting stdout and stderr to the same file or file descriptor with &> is common and nice, except that it is not required to be supported by POSIX:2001. Moreover, trying to use it with shells not supporting it will do exactly the opposite:

  1. The command's output (to stdout and stderr) won't be redirected anywhere.
  2. The command will be executed in the background.
  3. The file will be truncated, if redirecting to a file and not using >>.

Are the characters saved worth those effects? I don't think so. Just use this instead: "> file 2>&1". Make sure you get the first redirection right, "&> file 2>&1" isn't going to do the trick.
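A small sketch of the portable form in action. It assumes that mktemp is available (it is not required by POSIX), and the two echo commands merely stand in for a real command that writes to both streams.

```shell
#!/bin/sh
# mktemp is assumed to be available; it is not required by POSIX.
log=$(mktemp) || exit 1

# Order matters: first point stdout at the file, then make stderr a copy of it.
{ echo "to stdout"; echo "to stderr" >&2; } > "$log" 2>&1

captured=$(grep -c 'to std' "$log")
echo "lines captured: $captured"
rm -f "$log"
```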

22 January 2013

Raphael Geissert: The death of the netbooks?

It's been over four years since I bought my ASUS Eee PC 1000h. I have used it almost daily ever since. Back when I bought it, new models from different brands were being released every few months due to the netbook hype.

In spite of being resource-limited due to its 1.60 GHz Atom CPU and only 1 GB of RAM, I've managed to do pretty much everything with it. Building software is slow and watching HD videos is nearly impossible, even more so when streamed from the internet and played with flash. Its limited memory capacity makes the kernel swap tens of megabytes before the KDE4 desktop is fully loaded. After launching some day-to-day applications there are usually hundreds of MBs in swap.

In spite of all this, I run the KDE4 desktop and have been able to do things such as running up to two Debian virtual machines with several services (apache httpd, mysql server, openldapd, squid, etc.) and a Windows XP one all at the same time, under virtualbox. I could have probably booted another Debian virtual machine but that would have most likely rendered the DE unusable. Oh, and did I say that this is under the "VT-x"-less N270 CPU?

This so-called netbook has proved to be rock-solid. Every component is still fully functional except for its 7-hours lasting battery that didn't stand a full year of day-to-day use. The keyboard is still intact and so is everything else.

Last year I thought I was going to have to seriously consider buying a replacement after seeing what I think are some signs of the end of its life: after a routine deep cleanup the keyboard stopped working properly, to the point that I couldn't even log in because half the keyboard would send the signal of a totally unrelated key. I bought an external, but still small, USB keyboard which I used until the next deep cleanup somehow made the built-in keyboard work again.

The second sign came soon after the keyboard issue. The AC adapter was, well, no longer supplying power to the machine. Trying to buy one online proved to be futile. Replacement supplies for ASUS equipment are hard to find here in Mexico and importing them from the US results in the item being twice (or more) as expensive due to import taxes. They are even more expensive when one finally adds up the cost of shipping.

Fortunately, after spending some hours hunting down the failure in the adapter, it turned out to be a problem with the wires. With the cut wires repaired, the adapter was working again. The unit itself wasn't at fault.

Back to 2013, this netbook is ageing and every time I've looked at potential replacements I've found none that I like. I look for another netbook/ultrabook/laptop/whatever that is rock-solid, with a 10.1" or 11" display, and has a similarly compact but not oh-so-small-that-I-can't-even-type-by-only-using-my-fingertips keyboard.

The only devices that have caught my eye are the ASUS transformers (with the dock). I'm not interested in a device that only has 1 GB of memory and something between 32 and 64 GB of storage, however. I'm limited enough with my Eee's 160 GB HDD.

For my needs, the pre-installed Android would have to go away and I guess it would be fun to get a transformer to run under a standard Debian linux kernel. Since I'm not interested in doing that kind of kernel work the transformers are out of the question.

Based on this I think I can only partially agree with Russell Coker when he states that
If tablet computers with hardware keyboards replace traditional Netbooks that's not really killing Netbooks but introducing a new version of the same thing.
Tablets with hardware keyboards may, perhaps, be the next generation of the less than 10" netbooks, but to date I've yet to see something with a display smaller than 13" that is an upgrade over the 1000h Eee I own.

21 January 2013

Raphael Geissert: January's Debian mirrors update

It's been slightly over a month since December's update to http.debian.net. Since then, Debian's mirrors network has grown by 6 more archive mirrors. Many thanks to the Debian sponsors running them!

There are now about 370 archive mirrors serving it over http, an increase of 40 (12%) since April last year. The number of backports mirrors is now at 82, and 25 for archive.debian.org.

On the http.debian.net front there haven't been many changes since last month. Some major changes are in the works, but they didn't make it into January's code update. There were, however, a few issues with one of the hosts during the first couple of days of January. Apologies for the inconveniences it may have caused.

A new version of ftpsync addressing some issues should hopefully be released some time next month. Stay tuned to the debian-mirrors mailing list for a call for testers and probably a survey for mirror administrators.

16 January 2013

Raphael Geissert: A bashism a week: ulimit

Setting resource limits from a shell script is commonly done with the ulimit command.
Shells provide it as a built-in, if they provide it at all. As far as I know, there is no non-built-in ulimit command. One could be implemented with the Linux-specific prlimit system call, but even that requires a fairly "recent" kernel version (circa 2010).

Depending on the kind of resource you want to limit, you may get away with what some shells such as dash provide: CPU, FSIZE, DATA, STACK, CORE, RSS, MEMLOCK, NPROC, NOFILE, AS, LOCKS. I.e. options tfdscmlpnvw, plus H for the hard limit, S for the soft limit, and a for all. Bash allows other resources to be limited.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time. ulimit is not required by POSIX:2001 to be implemented for the shell.
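A run-time check could be sketched as follows. It probes the n option (NOFILE) inside a subshell so that the limit is not actually applied to the script; the value 64 is arbitrary and only used for the probe.

```shell
#!/bin/sh
# Probe in a subshell so that the limit is not applied to this script.
# The value 64 is arbitrary, chosen only for the probe.
if (ulimit -n 64) 2>/dev/null; then
    have_ulimit_n=yes
else
    have_ulimit_n=no
fi
echo "ulimit -n usable: $have_ulimit_n"
```

If the probe fails, the script can either abort with a clear message or carry on without the limit, depending on how important the limit is.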

10 January 2013

Raphael Geissert: Email security vs. MUAs

So much for security, when using a "smart" MUA

Also note that the https URL is now http due to the email security best practice.

9 January 2013

Raphael Geissert: A bashism a week: brace expansion

Brace expansion is well known and handy, but sadly it is not required by POSIX:2001. Shells that don't support it will simply and silently leave it as is.

If you use it to shorten commands, as in "echo Debian GNU/{Linux,kFreeBSD}", you have to spell it out or use some sort of loop.

When using brace expansion for sequences you will usually have to fall back to using the seq command or using loops. "{1..9}" can be replaced with "seq -s ' ' 1 9", "{1..9..2}" with "seq -s ' ' 1 2 9", and so on.
If you use brace expansion for sequences of characters then seq won't be of much help.

I must note that the seq command is not required by POSIX:2001, however.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.
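For reference, a sketch of loop-based replacements that rely on neither brace expansion nor seq. The variable names are made up for illustration.

```shell
#!/bin/sh
# Instead of: echo Debian GNU/{Linux,kFreeBSD}
names=""
for kernel in Linux kFreeBSD; do
    names="${names}${names:+ }GNU/${kernel}"
done
echo "Debian $names"

# Instead of {1..9..2}, and without relying on seq:
numbers=""
i=1
while [ "$i" -le 9 ]; do
    numbers="${numbers}${numbers:+ }${i}"
    i=$(($i + 2))
done
echo "$numbers"
```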

2 January 2013

Raphael Geissert: A bashism a week: read

Whether for interacting with the caller, for reading the output of some command, or a file descriptor in general, the read shell command can be found in many scripts.

Unless you stick to the POSIX:2001-required "read variable_name", possibly with the -r option, you should expect problems.


dash, for instance, supports prompts but nothing else.

Remember, if you rely on any non-standard behaviour or feature make sure you document it and, if feasible, check for it at run-time.
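A sketch of a read loop that sticks to the portable form, reading one variable with the -r option. The input lines are made up; the here-document stands in for a file or the output of a command.

```shell
#!/bin/sh
count=0
last=""
# IFS= keeps leading and trailing blanks; -r keeps backslashes literal.
while IFS= read -r line; do
    count=$(($count + 1))
    last=$line
done <<'EOF'
  first line, indented
a back\slash survives
EOF
echo "read $count lines, last one: $last"

# For a prompt, printf before a plain read avoids the -p option:
#   printf 'Continue? [y/n] '; read -r answer
```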

24 December 2012

Raphael Geissert: A bashism a week: taking a break

Short notice: due to holidays and people, rightfully, not paying much attention to the online world, this Wednesday there won't be a post from the "a bashism a week" series.

Enjoy the break.

19 December 2012

Raphael Geissert: A bashism a week: testing for equality

Well known, yet easy to find just about everywhere: using the "test"/"[" commands to test for equality with two equals signs (==).

Contrary to many programming languages, if you want to test for equality in a shell script you must only use the equals sign once.

Try to keep this in mind: under a shell that implements what is required by POSIX:2001, you may hit the unexpected in the following code.

if [ foo == foo ]; then
echo expected
else
echo unexpected
fi
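For comparison, the same check written with a single equals sign, which is the form that POSIX requires test/[ to support:

```shell
#!/bin/sh
# A single "=" is the string comparison operator required of test/[.
if [ foo = foo ]; then
    result=expected
else
    result=unexpected
fi
echo "$result"
```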

18 December 2012

Raphael Geissert: Nicer, but stricter

Lately I've been working on making the redirector nicer to the mirrors and to some potential users. More specifically, those behind a caching proxy.

The redirector is now nicer to traditional web proxies by redirecting to objects that are known not to change with a "moved permanently" code (HTTP status code 301.) This applies to files in the pool/ directory and ".pdiff" files, among others.
Previously, a traditional caching web proxy would sometimes end up with multiple copies of the same object, fetched from different mirrors; and the redirection would not be cached at all. With this change, this is no longer the case.

Using a caching proxy that is aware of the Debian repository design is still more likely to yield better results, however: If my memory serves correctly, apt-cacher has the ability of updating the Packages, Sources, and similar files with the ".pdiff"s on the server side. Apt-Cacher-NG apparently can use debdelta, and so on.
Check my blog post about one APT caching proxy not being efficient for some comments related to those tools.

Another recent change is that mirrors that can't be used by the redirector will no longer be monitored as often as the other mirrors. For instance, if a mirror doesn't generate a trace file (used for monitoring) then the redirector will gradually limit the rate at which the mirror is checked.
This rate-limiting mechanism applies to different kinds of errors, and should reduce the amount of wasted time and bandwidth while still allowing automatic-detection of mirrors that recover.


Projection of a rate-limited mirror over six weeks. The mirror would have to fail in every attempt for that to happen.
N.b. there's a bump in the scale.

The rate limiter applies an initial exception to allow temporary errors to not affect the use of the mirror by the redirector. After that exception, it is pretty much linear. However, that chart doesn't really reflect the effect of the rate limiter, so put in comparison with the normal checking behaviour:


Comparison of the two behaviours over an 8 weeks period using a logarithmic scale.
Nice chart colours by Libreoffice.

The code to detect mirrors that don't perform a two-stages sync that I talked about in a previous post has not yet been integrated as the current implementation would be too expensive on the mirrors to just add it as-is.

While tracking down problems exposed to users, I decided to take a stricter approach as to what mirrors are used by the redirector. Suffice to say that the remaining mirrors using the obsolete anonftpsync are going to be ignored entirely. ftpsync has been around for a few years now and it is the standard tool.
Whether you are mirroring Debian, Raspbian, Ubuntu, or any other Debian-like packages repository, ftpsync is the right tool to use.

Most of the issues I've been discovering, and sometimes working around, affect direct users of the mirrors and are not related to the http.debian.net redirector. When not detected beforehand they happen to be exposed by the redirector, but like I said, I plan to be stricter in order to increase the redirector's reliability. Once a strict and reliable foundation is built, more workarounds might see their way in to better use the available resources.

That's it for now. The road is long, the challenge is great, and being an observer in an uncontrolled environment makes it even more interesting.

12 December 2012

Raphael Geissert: A bashism a week: $RANDOM numbers

Commonly used to sleep a random amount of time or to create unique temporary file names, $RANDOM is one of those bashisms that you are best off avoiding altogether.

It is not uncommon to see scripts generating a "unique" temporary file name with code that goes like: tempf="/tmp/foo.$RANDOM", or tempf="/tmp/foo.$$.$RANDOM".

Under some shells the "unique" temporary file name will be "/tmp/foo." for the first example code. So much for randomness, right?

Even if you go around it by defining $RANDOM to the output of cksum after reading some bytes from /dev/urandom, please: don't do that. Use the mktemp command instead.
When creating temporary files there's more than just generating a file name. Just don't do it on your own: use mktemp. Really, use it, the list of those who weren't using mktemp (or similar) is large enough as it is.

Don't even dare to mention the linux kernel-based protection against symlink attacks. There's no excuse for not using mktemp.

Tip: If you are going to use multiple temporary files, create a temporary directory instead. Use mktemp -d.
Tip: Don't reuse a temporary file's name, even if you unlink/remove it. Generate a new one with mktemp.
Tip: Reusing also means doing things like tmp="$(mktemp)"; some_command > "$tmp.stdout" 2> "$tmp.stderr"
Tip: Even if $RANDOM is not empty, don't use it. It could have been exported as an environment variable. Again, just use mktemp.

For the remaining cases where you may want a pseudo random number such as for sleeping a random number of seconds: you can use something as simple as $$. Use shell arithmetic to adjust it as needed: use the modulo operator, multiply it, etc.

If you think you need something more "random" than the process' id, then you should probably not be using $RANDOM in the first place.
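Putting the tips together, a minimal sketch could look like the following. It assumes that mktemp is installed (it is not required by POSIX); the ls command and the file names inside the temporary directory are placeholders.

```shell
#!/bin/sh
# mktemp is assumed to be installed; it is not required by POSIX.
tmpdir=$(mktemp -d) || exit 1
# Remove the whole directory when the script exits.
trap 'rm -rf "$tmpdir"' EXIT

# Fixed names are fine here: the directory itself is private and unique.
ls / > "$tmpdir/stdout" 2> "$tmpdir/stderr"

# A rough pseudo-random delay between 0 and 9 seconds, derived from the PID.
delay=$(($$ % 10))
echo "would sleep for $delay seconds"
```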

5 December 2012

Raphael Geissert: Introducing: a bashism a week

No matter how many scripting languages exist, it appears that shell programming is here to stay. In many cases it is fast, it "does the job", and best of all: it is available "everywhere". The shell is used by makefiles, on every call to system(), and whatnot.

However, it is a real pain: implementations differ from the standards, some implementations still in use pre-date them, the standards leave room for undefined behaviour, and bugs in the implementations are anything but unknown. You can't just specify a given shell interpreter and think you've dealt with the problem. Writing shell scripts that are portable among many platforms is a nightmare, if even possible.

Surprisingly, in spite of all that, a great amount of shell scripts appear to work flawlessly in many systems.

The switch from bash to dash as the default shell interpreter in Debian wasn't done without quite some work (more if you list archived bug reports), and the work ain't over.

For the following months I will be writing about different "bashisms" every Wednesday, hopefully helping people write slightly-more-portable shell scripts. The posts are going to be focused on widely-seen bashisms, probably ignoring those that Debian's policy defines as required to be implemented.

The term "bashism" must be understood as any feature or behaviour not required by SUSv3 (aka POSIX:2001), no matter what its origins are or even if the behaviour is not exhibited by the bash shell.

One of the key points is documenting the script's requirements, starting by specifying the right shell interpreter in the shebang.

Let's see what comes out of this experiment.

As a matter of fact, I have a few months worth of posts written already. All posts are going to be published at scheduled dates, just like this very post.

3 December 2012

Raphael Geissert: Some things you wanted to know about http.debian.net

After quite a bit of, very welcome, feedback I've put together a FAQ page in an attempt to respond to the most common questions about http.debian.net.

Emails have been accumulating for a few weeks now, but I will get to them. So please be patient if you send me an email, or if you have sent me one.
